Underspecifying and Predicting Voice for Surface Realisation Ranking

Authors

  • Sina Zarrieß
  • Aoife Cahill
  • Jonas Kuhn
Abstract

This paper addresses a data-driven surface realisation model based on a large-scale reversible grammar of German. We investigate the relationship between the surface realisation performance and the character of the input to generation, i.e. its degree of underspecification. We extend a syntactic surface realisation system, which can be trained to choose among word order variants, such that the candidate set includes active and passive variants. This allows us to study the interaction of voice and word order alternations in realistic German corpus data. We show that with an appropriately underspecified input, a linguistically informed realisation model trained to regenerate strings from the underlying semantic representation achieves 91.5% accuracy (over a baseline of 82.5%) in the prediction of the original voice.

Similar resources

Designing Features for Parse Disambiguation and Realisation Ranking

We present log-linear models for use in the tasks of parse disambiguation and realisation ranking in German. Forst (2007a) shows that by extending the set of features used in parse disambiguation to include more linguistically motivated information, disambiguation results can be significantly improved for German data. The question we address in this paper is to what extent this improved set of ...

Stochastic Realisation Ranking for a Free Word Order Language

We present a log-linear model that is used for ranking the string realisations produced for given corpus f-structures by a reversible broad-coverage LFG for German and compare its results with the ones achieved by the application of a language model (LM). Like other authors that have developed log-linear models for realisation ranking, we use a hybrid model that uses linguistically motivated lea...

Incorporating Information Status into Generation Ranking

We investigate the influence of information status (IS) on constituent order in German, and integrate our findings into a loglinear surface realisation ranking model. We show that the distribution of pairs of IS categories is strongly asymmetric. Moreover, each category is correlated with morphosyntactic features, which can be automatically detected. We build a loglinear model that incorporates...

To what extent does sentence-internal realisation reflect discourse context? A study on word order

We compare the impact of sentence-internal vs. sentence-external features on word order prediction in two generation settings: starting out from a discriminative surface realisation ranking model for an LFG grammar of German, we enrich the feature set with lexical chain features from the discourse context which can be robustly detected and reflect rough grammatical correlates of notions from the...

Prediction and Realisation of Conversational Characteristics by Utilising Spontaneous Speech for Unit Selection

Unit selection speech synthesis has reached high levels of naturalness and intelligibility for neutral read aloud speech. However, synthetic speech generated using neutral read aloud data lacks all the attitude, intention and spontaneity associated with everyday conversations. Unit selection is heavily data dependent and thus in order to simulate human conversational speech, or create synthetic...

Publication date: 2011